(57) ABSTRACT

A method for displaying an enhanced multimedia presenta- 
tion including personalized supplementary audio, video, and 
graphic content selectable by a user and rendered by a 
receiving device, comprises the steps of: communicating a 
multimedia presentation file to the receiving device, the 
multimedia presentation file comprising base multimedia 
presentation content and, frame-synchronized information 
including starting frame timing identifier, ending frame 
timing identifier, starting frame spatial coordinates, ending 
frame spatial coordinates, and motion vector specifications 
for describing frame-accurate location, motion and timing of 
the personalized supplementary audio, video, and graphic 
content, the frame-synchronized information indicating one 
or more free areas of the multimedia presentation absent 
significant base multimedia content; extracting the frame- 
synchronized information from the multimedia presentation 
file; retrieving the personalized supplementary content from 
the receiving device; decoding the personalized supplemen- 
tary content at a time sufficiently in advance of the starting 
frame timing identifier; and the receiving device selecting an 
indicated free area and initiating display of one or more 
items of the personalized supplementary content at frame- 
accurate times between the starting frame timing identifier 
and ending frame timing identifier at the frame coordinates 
in accordance with the frame-synchronized information.

28 Claims, 4 Drawing Sheets 
SYSTEM FOR VIDEO, AUDIO, AND GRAPHIC PRESENTATION IN TANDEM
WITH VIDEO/AUDIO PLAY

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to the displaying of graphics
objects such as text or sprites overlaying a multimedia
television presentation, and more specifically to the display
of animated graphics or play out of video or audio coordinated
with a multimedia presentation.

2. Description of Prior Art

Many video applications, including interactive and multimedia
applications, take advantage of the video viewer's
equipment capability to display graphics overlays on the
video screen such as a TV or a PC monitor. These graphics
displays either dominate the entire screen, as in the case of
many electronic program guides or menus, or sections
thereof. The video behind these graphic overlays is entirely
or partially obscured, thereby interfering with the viewing
experience. Systems for the presentation of electronic program
guides, such as described in U.S. Pat. Nos. 5,737,030,
5,592,551, 5,541,738, and 5,353,121, display these guides
either on a screen devoid of video or one which uses a still
frame or moving video simply as a background, with no
coordination between the location of items in the video and
the location of graphics overlays.

Currently, viewers' equipment, such as set-top boxes
(STB), does not have the capability to determine where
objects are located in the video. Determination of an object's
location in a video is necessary in order to place the graphics
objects, such as the on-screen text or animated characters, in
locations which do not interfere with objects appearing in
the video presentation.

Systems such as the one described in U.S. Pat. No.
5,585,858 attempt to coordinate video and graphic displays
by including in the broadcast stream, or pre-storing at the
viewers' equipment, graphic overlay screens designed to be
compatible with the video content. However, these screens
must be created well in advance of the presentation, and thus
lack the flexibility to create and display non-interfering
graphics overlays adaptively. In addition, those systems
display graphics at specific "trigger points" in the
presentation, not at arbitrary points throughout the
presentation.

Other systems which add graphics or audio content to an
existing presentation, such as described in U.S. Pat. No.
5,708,764, require the active participation of the viewer in
the process of presentation. The viewer, for example, may be
required to answer a number of questions before or during
the presentation; the responses are then displayed on the
screen at predetermined times.

Systems which allow the personalization of content for
individual users are well known in the context of Web
browsing. Other systems, such as systems described in U.S.
Pat. Nos. 5,585,858 and 4,616,327, provide a limited number
of introductions, by the viewers' equipment, of predetermined
text or graphics. Some systems, such as described
in U.S. Pat. Nos. 4,839,743, 4,786,967, and 4,847,700,
provide audio and/or video personalization through the
selection among a small number of alternate video and audio
tracks which are broadcast simultaneously. The selection is
performed at the viewer's equipment.

What is needed is a system whereby the location and
timing of video objects and audio events are made available
to the viewers' display equipment, giving that equipment the
flexibility to add non-interfering graphics or audio when and
where it sees fit, in an adaptive manner throughout a
presentation, rather than at limited points. This ability will
allow the viewers' equipment to create a tandem video/
audio/graphics presentation without requiring viewers'
active participation in the presentation process. That system
must allow coordination of graphics content that is not
pre-stored, such as broadcast news bulletins, and perform
still or animated graphics overlay of video, addition or
replacement of video, and audio replacement in coordination
with the existing video and audio content of a presentation.

SUMMARY OF THE INVENTION

The present invention is a system for the definition and
use of information which enables the display or playing of
audio, video or graphics objects in tandem with the video
and audio play of a digital video presentation. The presentation
thus enhanced may be available via a broadcast or in
a video-on-demand scenario. The video distribution system
over which the video is made available can be a one-way
system, such as a terrestrial television broadcast, or a
two-way communication, such as a hybrid fiber/coaxial
cable system with return channel capability.

The invention enables the tandem presentation of additional
audio, video, or graphics by defining video and audio
"holes" in the video or audio presentation at which there is
no significant video or audio activity. "Holes" are locations
and times in the video presentation. Graphics or audio
objects are appropriately presented by the STB in those
"holes". The STB is notified as to the location and/or times
associated with these "holes", as well as other information
which characterizes the material which the STB must
present.

With this information, this invention allows the STB to
judiciously place graphics objects on screen or play audio or
video content, and avoid interference with video objects or
audio events. The graphics objects displayed by the STB can
be static or dynamic, i.e., animated. Thus, the invention also
enables the creation of video presentations in which objects
in the original video or animation interact and move in
tandem with video or graphics objects which are added by
the viewer's equipment. For example, a cartoon may be
created in which several characters are seen on screen at
once and a "hole" is left for the addition of an animated
character which is added by the viewer's equipment such as
an STB.

Alternatively, the "hole" could be defined at the location
of a relatively less important character which can be
obscured by the STB-animated character. The viewers
whose STB does not support the present invention will still
be able to see a presentation with no video "holes". The
information as to what type of character can be added, at
what locations, at what times, and optionally, the
motion of the added character must be delivered to the STB
in advance of the display of the character.

Similarly, the invention allows tandem audio play
between the audio content of the presentation and audio
content which is introduced by the STB.

The invention allows for the personalization of the video,
graphics or audio content introduced by the STB. The
personalization is achieved by a viewer when he or she
specifies several personal parameters, such as name and age,
through a viewer interface. To continue the above example,
a child's name may be entered in the STB's personalization
information. When viewing the prepared presentation, the
STB-animated character can display this child's name when
this character is presented in the location of video "holes".
Alternatively, the STB can play an audio clip of the child's
name during audio "holes." Personalized audio or video
clips may be recorded and stored in the STB for use in the
tandem play.

Thus, the present invention allows a single version of
material such as a cartoon presentation to be created and
broadcast, yet be viewed and heard differently by various
viewers, and tailored to them specifically. A hybrid presentation
is in effect created, the sum of the original presentation
and the graphics and/or audio which is introduced by the
viewers' STB into the "holes."

Accordingly, in the present invention personalization
information, audio and video segments and possibly "hole"
information are stored in the STB. The STB receives a
multimedia presentation stream embedded with "hole"
information. The "hole" information is embedded into the
stream during an authoring stage, where the creator of the
presentation determines the "hole" locations and times. That
"hole" information is extracted on the STB, and audio and
video segments and personalization information
stored on the STB are coordinated with the "holes" and
displayed in tandem with the multimedia presentation.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a view of a monitor screen displaying an
animated presentation with the location of a video "hole"
indicated.

FIG. 2 is the view of the same screen as FIG. 1, with the
addition of an STB-animated character in the video "hole"
location.

FIG. 3 is a flowchart showing steps involved in extracting
and processing "hole" information from a multimedia
presentation stream.

FIG. 4 shows typical equipment necessary for the extraction
of "hole" information and display of tandem content.

DETAILED DESCRIPTION OF THE INVENTION

The steps necessary to prepare and to play a presentation
with tandem STB video graphics display and/or audio or
video play according to the invention include:

1. defining video and audio "holes" during an authoring
stage and embedding them as part of control information
in the presentation stream with video and audio;

2. performing personalization on viewer's STB;

3. delivering the presentation stream to viewer's STB;

4. extracting the control information from the presentation
stream and parsing by the STB; and

5. displaying video and audio of the presentation stream
together with graphics, audio, or video objects provided
by the STB during the time and location of the "holes".

Authoring Stage

In order to specify the location and time of video and
audio "holes", a video presentation must be marked with
control information. In the preferred embodiment, this is
done offline, through the use of an authoring system
designed for this marking process and described in U.S.
patent application Ser. No. 09/032,491.

The control information may also be added in real time to
a live presentation in progress, by specifying video "holes"
to the STB. The STB will use this information to display text
associated with the program, e.g., news or a sports program,
and broadcast along with the video and audio. The choice of
text for display can be based on personalization information
already stored in the STB.

The authoring system accepts as input video/audio content.
An author steps through the content, marking locations
of video and/or audio "holes." The markings thus created are
used by the authoring system to create control information
describing these "holes", which is inserted into the video/
audio content.

In the preferred embodiment, the control information
takes the form of HTML tags which indicate:

1. "hole" identifier used to coordinate "hole" with insertion
application,

2. "hole" type, e.g., video or audio,

3. beginning time of "hole",

4. ending time of "hole",

5. beginning screen location of "hole", e.g., x, y coordinates
in video,

6. ending screen location of "hole", e.g., x, y coordinates
in video,

7. motion vector for "hole" movement,

8. description of bitmap(s) to be inserted in video "hole",
and

9. volume level for inserted audio.

Automatic object recognition may be incorporated into
the authoring system to simplify the authoring process. An
author specifies the initial location of a video object, e.g., a
less-significant character, and its subsequent locations are
detected by the authoring system, which inserts appropriate
control information into the stream as the object moves.

For digital video streams, the Motion Pictures Experts
Group (MPEG-2) compression for audio and video signals,
and MPEG-2 Systems transport for the transport of those
signals may be used. Because of the high bit rate requirements
of digital video, a compression method is usually
applied to a video before transmission over a network. In the
preferred embodiment, video and audio are compressed
using MPEG-2 compression, as specified in ISO/IEC 13818-2
for video and ISO/IEC 13818-3 for audio.

The MPEG-2 standard also specifies how presentations
consisting of audio and video elementary streams can be
multiplexed together in a "transport stream". This is specified
in the MPEG-2 Systems Specification, ISO/IEC 13818-1.
The MPEG-2 Systems Specification accommodates the
inclusion in a presentation's transport stream of non-video
and non-audio streams, by use of "private data" streams. All
transport stream packets, regardless of content, are of a
uniform size (188 bytes) and format. "Program Specific
Information", which is also carried in the transport stream,
carries the information regarding which elementary streams
have been multiplexed in the transport stream, what type of
content they carry, and how they may be demultiplexed. In
this embodiment, the control information is carried in an
MPEG-2 Transport Stream private data stream.

In the embodiment utilizing MPEG-2 video, beginning
and ending times for "hole" specification are specified in
terms of the Presentation Time Stamp (PTS) of the frames
where the "hole" appears. PTSs are typically present in
every frame to every third frame, and this is sufficient for
synchronization, since the frame rate for NTSC video is 30
frames/second. Video "holes" are rectangular, and thus
specified by a pair of (x, y) coordinates. Other embodiments
may use more complex polygons to describe video "hole"
shape, and require more coordinates and a specification of
which polygon is to be used. The video "hole" movement is
linear between the beginning and ending screen location.
Again, more complex functions may be specified in other
embodiments to describe video "hole" movement. 
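
The timing and linear movement of a video "hole" described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the record name VideoHole and its field names are hypothetical, the screen location is taken to be a single reference point such as one corner of the rectangle, and wrap-around of the PTS counter is ignored. PTS values count ticks of a 90 kHz clock, so at the 30 frames/second figure used above one frame spans 3000 ticks.

```python
from dataclasses import dataclass

PTS_HZ = 90_000                    # PTS values count ticks of a 90 kHz clock
TICKS_PER_FRAME = PTS_HZ // 30     # 3000 ticks per frame at 30 frames/second


@dataclass(frozen=True)
class VideoHole:
    """Hypothetical in-memory form of one video "hole" record."""
    hole_id: str
    begin_pts: int     # PTS of the first frame in which the hole appears
    end_pts: int       # PTS of the last frame in which the hole appears
    begin_xy: tuple    # (x, y) reference point at begin_pts
    end_xy: tuple      # (x, y) reference point at end_pts

    def position_at(self, pts):
        """Return the hole's (x, y) at `pts`, or None outside its lifetime.

        Movement is linear between the beginning and ending locations.
        PTS wrap-around is not handled in this sketch.
        """
        if not self.begin_pts <= pts <= self.end_pts:
            return None
        span = self.end_pts - self.begin_pts
        if span == 0:
            return self.begin_xy
        t = (pts - self.begin_pts) / span
        x = self.begin_xy[0] + t * (self.end_xy[0] - self.begin_xy[0])
        y = self.begin_xy[1] + t * (self.end_xy[1] - self.begin_xy[1])
        return (round(x), round(y))
```

For a hole lasting 30 frames that moves from (100, 50) to (400, 200), position_at evaluated halfway through its lifetime yields the midpoint of the two locations. An embodiment using a more complex movement function would replace only the interpolation inside position_at.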
Delivery Stage 

The control information may be expressly created for the 
function of the present invention with "holes" left in the 
video and/or audio for insertion of the content by the STB. 
In order to show a full presentation to those viewers whose 
STB does not support the present invention, "holes" may 
actually be a default unit of video or audio content. Presen- 
tations which were not designed for the present invention 
may be retrofitted to accommodate it, i.e., "holes" may be 
found in the existing content areas and/or sounds which can 
be overlaid. 

After forming the control information, the video presentation
together with such control information may be transported to
the viewer's STB by being sent:

a. in the vertical blanking interval of an analog video signal
and extracted by the viewers' equipment in a manner
similar to that used for closed-caption information;

b. in a separate Vestigial Side Band channel; 

c. within a digital video/audio stream, with extraction of the
embedded data performed by the viewers' equipment
in a manner similar to that used for the extraction of
video or audio streams.
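
For delivery within an MPEG-2 transport stream, separating the packets that carry control information from the video and audio packets can be sketched as follows. This is a minimal illustration, not the demultiplexer of any particular STB: the function names are hypothetical, and the packet identifier (PID) to select is assumed to be already known, whereas a real receiver would learn it from the Program Specific Information. The sketch relies only on the fixed 188-byte packet size, the 0x47 sync byte, and the 13-bit PID field of the transport packet header.

```python
TS_PACKET_SIZE = 188   # uniform transport stream packet size in bytes
SYNC_BYTE = 0x47       # first byte of every transport stream packet


def parse_ts_header(packet):
    """Decode a few fields from the 4-byte transport packet header."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("transport packets are exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("missing 0x47 sync byte")
    return {
        "payload_unit_start": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],   # 13-bit packet identifier
        "continuity_counter": packet[3] & 0x0F,
    }


def packets_for_pid(stream, pid):
    """Yield each whole 188-byte packet in `stream` whose header carries `pid`."""
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        if parse_ts_header(packet)["pid"] == pid:
            yield packet
```

A private data stream carrying "hole" information would be collected by calling packets_for_pid with the PID assigned to that stream; the payload bytes following the header would then be handed to the control information parser.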

The STB 

FIG. 4 shows typical equipment necessary for the present 
invention. It comprises a television set or a monitor screen 
4, cable 6 to receive the multimedia presentation, the STB 5 
to accept, process and to forward the resulting presentation 
over cable 7, to be displayed on the monitor screen 4. 
MPEG-2 demultiplexers, MPEG-2 audio decoders and 
MPEG-2 video decoders are now widely available. The 
C-Cube CL9110 Transport Demultiplexer, C-Cube CL9100
MPEG-2 Video Decoder, and Crystal Semiconductor 
CS4920 MPEG Audio Decoder are examples. In the pre- 
ferred embodiment, the video and audio decoders may be 
implemented together in a single chip, such as the IBM 
CD21 MPEG-2 Audio/Video decoder. If not incorporated in 
the audio and video decoder, an intermediate IC is necessary 
at the output of the decoders to convert from digital to 
analog and, in the case of video, encode to the desired video 
analog signal format such as NTSC, PAL, or SECAM. 
S-video output from these IC's is optional. 

The on-screen graphics objects which overlay video con- 
tent are rendered using the on-screen display (OSD) func- 
tions of the MPEG-2 Video Decoder in the STB. These 
decoders vary in the sophistication of the OSD which they 
offer and in the application program interfaces (API) which 
are used to control the OSD. Individual pixels can be 
addressed, and bitmaps are used for many text and graphic 
objects. A minimum level of OSD graphics capability offers 
16 colors. A preferred capability offers 256 colors and 
multi-level blending capability. The blending capability of 
the OSD allows for varying degrees of opacity for the 
graphics overlay. 
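
The effect of multi-level blending on a single pixel can be sketched as a weighted mix of the overlay color and the underlying video color. The sketch is an assumption-laden illustration: the function name blend_pixel, the use of RGB triples, and the default of 16 blend levels are all hypothetical, since the number of levels and the pixel format depend on the particular decoder and its OSD interface.

```python
def blend_pixel(overlay, video, level, levels=16):
    """Mix an overlay RGB pixel over a video RGB pixel.

    `level` runs from 0 (overlay fully transparent) to `levels - 1`
    (overlay fully opaque). The default of 16 levels is an assumption.
    """
    if not 0 <= level < levels:
        raise ValueError("blend level out of range")
    alpha = level / (levels - 1)
    return tuple(round(alpha * o + (1 - alpha) * v)
                 for o, v in zip(overlay, video))
```

At the highest level the overlay color replaces the video color, at level 0 the video shows through unchanged, and intermediate levels give the varying degrees of opacity described above.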

Overlay of audio content is performed by the STB audio 
decoder in the case of MPEG audio or by the STB processor 
utilizing an API to a media player. File formats supported by 
this player include ".wav", ".rmi", and ".mid". Alternatively, 
the audio playing function can be incorporated into the 
STB's application itself. Video replacement or addition can 
be performed by an additional video decoder in the STB. 
Systems with "picture-in-picture" capability can use this 
feature for addition or replacement of video objects. 

In either case, the audio being played is mixed with or 
preempts the original audio of the presentation, utilizing the 
STB's audio output. In another embodiment, one in which 
two tracks of audio are available, one for music and one for
dialogue, the STB can replace the content of the latter track
while allowing the former to continue as usual.

The presentation of the present invention, which is to be
viewed, may be broadcast using the NTSC or PAL standards for
analog television or the ATSC or DVB standards for digital
television. In another embodiment, the presentation may be
viewed and controlled on a per-user basis, as with a
video-on-demand system or viewing from a video tape.

The processing power needed to implement the present 
invention can be easily accommodated by the processing 
capabilities of the processors in most current STB's, which
start at roughly 1 MIPS. This processor runs the video/audio
content insertion application, and controls the use of the 
OSD and audio functions. 

An STB 5 typically has between 1 and 4 MB of RAM. 
The program of the present invention needs to be downloaded
to or stored in the RAM of the STB, where it would occupy
up to approximately 0.5 MB.

Only a small amount of the STB 5 storage is required to 
store personalization information for all viewers in a house- 
hold. In the preferred embodiment, personalization informa- 
tion for each viewer includes: 

1. name, 

2. age, 

3. content restrictions, e.g., PG-13, 

4. text preference, e.g., large type, 

5. enable audio replacement, 

6. enable video replacement, and 

7. pointer to sprite associated with viewer. 

This information needs to be stored in non-volatile 
memory in order to persist when the viewers' STB is 
powered off or during power failures. Typical STB's have 
non-volatile RAM for this purpose. 
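
The personalization record listed above, and the small amount of storage it needs, can be illustrated with a sketch. The record name ViewerProfile, its field names, and the choice of a compact JSON encoding are assumptions made for illustration only; no storage format is specified here, and an actual STB might use a fixed binary layout instead.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ViewerProfile:
    """Hypothetical record holding the seven personalization items."""
    name: str
    age: int
    content_restriction: str    # e.g., "PG-13"
    text_preference: str        # e.g., "large"
    audio_replacement: bool     # enable audio replacement
    video_replacement: bool     # enable video replacement
    sprite_ref: str             # pointer to the sprite associated with the viewer


def encode_profiles(profiles):
    """Serialize all household profiles to bytes for non-volatile storage."""
    records = [asdict(p) for p in profiles]
    return json.dumps(records, separators=(",", ":")).encode("utf-8")


def decode_profiles(blob):
    """Rebuild the profile list from the stored bytes."""
    return [ViewerProfile(**record) for record in json.loads(blob.decode("utf-8"))]
```

Even in this unoptimized text encoding, the profiles for a household of four occupy well under a kilobyte, which is consistent with the small storage requirement stated above.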

FIGS. 1 and 2 provide example screen displays according 
to a presentation prepared initially for a tandem play. FIG. 
1 shows a screen 10 of an animated program with one video 
character 20. The location of a "hole" 30 is indicated by 
dotted lines 40. The dotted lines 40 around the "hole" 30 are 
only illustrative, and would not appear in the actual pro- 
gram. Control information concerning the location of the 
"hole" 30 is embedded in the video stream and extracted by 
the STB. 

FIG. 2 shows the same screen with the addition of an
STB-animated character 50 which is displayed in the location
of the "hole" 30. Alternatively, the STB could have used
the "hole" 30 for display of graphics text describing the 
character, for example. 

It is also possible to prepare for a presentation utilizing a 
mechanism that looks for locations of "holes" 30 which 
occur naturally in the audio and video presentation. 
Alternatively, "holes" 30 may be created in a presentation by 
blanking out sections of the existing audio track or obscur- 
ing sections of the video screen. 

The logical flow of the application which is loaded into
the STB and used to parse control data of the video presentation
stream and to display information stored in the STB
in the "holes" 30 of the presentation is shown in FIG. 3. The
Program Specific Information (PSI) of the current
presentation is parsed at step 80. A determination is made at
step 81 whether any control information with "hole" locations
will be arriving with this presentation. If the information
will not be arriving, the program control returns to step
80, and the next presentation will be parsed. If the information
will be arriving, then at step 82 demultiplexer queues
are set up to receive it. At step 83, a determination is made
whether the control data has arrived in the demultiplexer
queues; if not, the test at step 83 is repeated. When the
information has arrived at the queues, it is parsed at step 84
to ascertain the HTML tags. At step 85 the HTML tags are
matched with the "hole" information. If there is no match,
the program control returns to step 83. If there is a match,
step 86 assigns the received data to associated variables, and
returns program control to step 83.
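
Steps 83 through 86 of this flow can be sketched with a small tag parser. The sketch is hypothetical in several respects: the tag name "hole" and its attribute names are invented for illustration because no tag syntax is given here, and an ordinary iterable of text chunks stands in for the demultiplexer queues.

```python
from html.parser import HTMLParser


class HoleTagParser(HTMLParser):
    """Collect attributes of hypothetical <hole ...> tags, keyed by hole id."""

    def __init__(self):
        super().__init__()
        self.holes = {}

    def handle_starttag(self, tag, attrs):
        if tag != "hole":
            return                          # step 85: no match, keep waiting
        fields = dict(attrs)
        hole_id = fields.pop("id", None)
        if hole_id is not None:
            self.holes[hole_id] = fields    # step 86: assign data to variables


def process_control_data(chunks):
    """Parse every chunk of control data and return the collected "holes"."""
    parser = HoleTagParser()
    for chunk in chunks:                    # step 83: data has arrived in the queue
        parser.feed(chunk)                  # step 84: parse to find the tags
    parser.close()
    return parser.holes
```

Feeding the parser a chunk such as `<hole id="h1" type="video" begin="90000" end="180000">` would store the type and the beginning and ending times under the identifier "h1", ready to be matched with the insertion application.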

When all the information about "holes" and the overlay
information is parsed and assembled in the STB, it
becomes a straightforward, commonly known task of the
STB to overlay content at given "hole" coordinates with
overlay data while displaying the presentation stream on a
video monitor. A similar process applies to audio "holes."

While the invention has been particularly shown and 
described with respect to illustrative and preferred embodi- 
ments thereof, it will be understood by those skilled in the 
art that the foregoing and other changes in form and details 
may be made therein without departing from the spirit and 
scope of the invention that should be limited only by the 
scope of the appended claims. 

Having thus described our invention, what we claim as 
new, and desire to secure by Letters Patent is: 

1. A method for displaying an enhanced multimedia 
presentation including personalized supplementary audio, 
video, and graphic content selectable by a user and rendered 
by a receiving device, the method comprising: 

communicating a multimedia presentation file to said 
receiving device, said multimedia presentation file 
comprising base multimedia presentation content and, 
frame-synchronized information including starting 
frame timing identifier, ending frame timing identifier, 
starting frame spatial coordinates, ending frame spatial 
coordinates, and motion vector specifications for 
describing frame-accurate location, motion and timing 
of said personalized supplementary audio, video, and 
graphic content, said frame-synchronized information 
indicating one or more free areas of said multimedia 
presentation absent significant base multimedia con- 
tent; 

extracting said frame-synchronized information from said 
multimedia presentation file; 

retrieving said personalized supplementary audio, video 
and graphic content from said receiving device; 

decoding said personalized supplementary audio, video 
and graphic content at a time sufficiently in advance of 
said starting frame timing identifier; and 

said receiving device selecting an indicated free area and 
initiating display of one or more items of said person- 
alized supplementary audio, video and graphic content 
at frame-accurate times between said starting frame 
timing identifier and ending frame timing identifier at 
said frame coordinates in accordance with said frame- 
synchronized information. 

2. The method of claim 1, wherein said supplementary
audio, video, and graphic content is stored in said receiving 
device. 

3. The method of claim 2, wherein said supplementary 
audio, video, and graphic content is communicated with said 
multimedia presentation base content. 

4. The method of claim 3, wherein said frame- 
synchronized information is determined and embedded in 
said multimedia presentation file in an authoring step prior 
to the communication step. 

5. The method of claim 4 wherein said frame- 
synchronized information is allowed to be altered in said 
receiving device via a user interface. 
6. The method of claim 5, wherein said frame-
synchronized information further includes: an identifier for
coordination with a video/audio content insertion
application, a media type, and, description of a bitmap if said
video is to be inserted, and volume level if audio is to be
inserted.

7. The method of claim 6, wherein said frame- 
synchronized information is defined in such a way that 
displaying of said supplementary audio, video and graphic 
content will not interfere with viewing of said multimedia 
presentation base content. 

8. The method of claim 7, wherein said frame- 
synchronized information is defined in frame-synchronized 
coordination with visible objects in said multimedia presen- 
tation base content. 

9. The method of claim 8, wherein said frame- 
synchronized information is defined in such a way that 
supplementary audio play can be performed without inter- 
fering with the sound of said multimedia presentation base 
content. 

10. The method of claim 9, wherein said frame-
synchronized information is defined in such a way that
supplementary audio content can be introduced in coordi- 
nation with the audio units of said multimedia presentation 
base content. 

11. The method of claim 10, wherein said frame-
synchronized information is used in displaying said supple-
mentary audio, video, and graphic content in such a way as 
not to interfere with the viewing or hearing of said multi- 
media presentation base content. 

12. The method of claim 11, wherein said frame- 
synchronized information is used in displaying said supple- 
mentary audio, video, and graphic content which are coor- 
dinated with base audio, video and graphic content of said 
multimedia presentation, forming a hybrid of coordinated 
presentation from the conjunction of said base multimedia 
presentation content and said supplementary audio, video, 
and graphic content. 

13. The method of claim 12, wherein personalization 
information is stored in said receiving device via said user 
interface. 

14. The method of claim 13, wherein said personalization 
information includes: said viewer's name, said viewer's age, 
content restriction for said viewer, text preference, audio 
replacement enablement switch, video replacement enable- 
ment switch, and a pointer to a sprite associated with a 
viewer. 

15. A computer program device readable by a machine, 
tangibly embodying a program of instructions executable by 
a machine to perform method steps for displaying an 
enhanced multimedia presentation including personalized 
supplementary audio, video, and graphic content selectable 
by a user and rendered by a receiving device, the method 
comprising: 

communicating a multimedia presentation file to said 
receiving device, said multimedia presentation file 
comprising base multimedia presentation content and, 
frame-synchronized information including starting 
frame timing identifier, ending frame timing identifier, 
starting frame spatial coordinates, ending frame spatial 
coordinates, and motion vector specifications for 
describing frame-accurate location, motion and timing 
of said personalized supplementary audio, video, and 
graphic content, said frame-synchronized information 
representing free areas of said multimedia presentation 
absent significant base multimedia content; 

extracting said frame-synchronized information from said 
multimedia presentation file; 
retrieving said personalized supplementary audio, video
and graphic content from said receiving device;

decoding said personalized supplementary audio, video
and graphic content at a time sufficiently in advance of
said starting frame timing identifier; and

selecting an indicated free area and initiating display of
one or more items of said personalized supplementary
audio, video and graphic content at frame-accurate
times between said starting frame timing identifier and
ending frame timing identifier at said frame coordinates
in accordance with said frame-synchronized information.

16. The method of claim 15, wherein said supplementary
audio, video, and graphic content is stored in said receiving
device.

17. The method of claim 16, wherein said supplementary
audio, video, and graphic content is communicated with said
multimedia presentation base content.

18. The method of claim 17, wherein said frame-synchronized
information is determined in an authoring step
prior to the communication step.

19. The method of claim 18, wherein said frame-synchronized
information is allowed to be altered in said
receiving device via a user interface.

20. The method of claim 19, wherein said frame-synchronized
information further includes: an identifier for
coordination with a video/audio content insertion
application, a media type, and, description of a bitmap if said
video is to be inserted, and volume level if audio is to be
inserted.

21. The method of claim 20, wherein said frame-synchronized
information is defined in such a way that
displaying of said supplementary audio, video and graphic
content will not interfere with viewing of said multimedia
presentation base content.

22. The method of claim 21, wherein said frame-synchronized
information is defined in frame-synchronized
coordination with visible objects in said multimedia
presentation base content.

23. The method of claim 22, wherein said frame-synchronized
information is defined in such a way that
supplementary audio play can be performed without interfering
with the sound of said multimedia presentation base
content.

24. The method of claim 23, wherein said frame-synchronized
information is defined in such a way that
supplementary audio content can be introduced in coordination
with the audio units of said multimedia presentation
base content.

25. The method of claim 24, wherein said frame-synchronized
information is used in displaying said supplementary
audio, video, and graphic content in such a way as
not to interfere with the viewing or hearing of said multimedia
presentation base content.

26. The method of claim 25, wherein said frame-synchronized
information is used in displaying said supplementary
audio, video, and graphic content which are coordinated
with base audio, video and graphic content of said
multimedia presentation, forming a hybrid of coordinated
presentation from the conjunction of said base multimedia
presentation content and said supplementary audio, video,
and graphic content.

27. The method of claim 26, wherein personalization
information is stored in a receiving device via said user
interface.

28. The method of claim 27, wherein said personalization
information includes: said viewer's name, said viewer's age,
content restriction for said viewer, text preference, audio
replacement enablement switch, video replacement enablement
switch, and a pointer to a sprite associated with a
viewer.

* * * * *